AITopics | empirical fisher approximation

46a558d97954d0692411c861cf78ef79-Paper.pdf

Neural Information Processing Systems | Feb-12-2026, 01:37:27 GMT

approximation, empirical fisher, fisher, (12 more...)

Country:

Europe > Sweden > Stockholm > Stockholm (0.05)
North America > United States > Indiana > Hamilton County > Fishers (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
(12 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)

Limitations of the empirical Fisher approximation for natural gradient descent

Neural Information Processing Systems | Dec-25-2025, 08:06:29 GMT

Natural gradient descent, which preconditions a gradient descent update with the Fisher information matrix of the underlying statistical model, is a way to capture partial second-order information. Several highly visible works have advocated an approximation known as the empirical Fisher, drawing connections between approximate second-order methods and heuristics like Adam. We dispute this argument by showing that the empirical Fisher, unlike the Fisher, does not generally capture second-order information. We further argue that the conditions under which the empirical Fisher approaches the Fisher (and the Hessian) are unlikely to be met in practice, and that, even on simple optimization problems, the pathologies of the empirical Fisher can have undesirable effects.
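As a minimal illustration of the distinction the abstract draws, the sketch below compares the two matrices on a linear least-squares problem. It assumes a Gaussian likelihood with unit variance, which is a choice made here for illustration and not the paper's experimental setup; all function names and the small damping constant are hypothetical. Under this assumption the Fisher equals the Hessian of the negative log-likelihood, while the empirical Fisher weights each outer product by the squared residual.

```python
import numpy as np

def neg_log_lik_grad(X, y, theta):
    # Gradient of 0.5 * sum_n (y_n - x_n^T theta)^2 with respect to theta.
    return -X.T @ (y - X @ theta)

def fisher(X):
    # Fisher for p(y | x, theta) = N(y; x^T theta, 1): the expectation is over
    # y drawn from the model, so E[(y - x^T theta)^2] = 1 and F = sum_n x_n x_n^T.
    return X.T @ X

def empirical_fisher(X, y, theta):
    # Empirical Fisher: outer products of per-example gradients evaluated at
    # the observed targets y_n, i.e. sum_n r_n^2 x_n x_n^T.
    residuals = y - X @ theta
    per_example_grads = X * residuals[:, None]
    return per_example_grads.T @ per_example_grads

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
theta_true = np.array([1.0, -2.0])
y = X @ theta_true + 0.1 * rng.normal(size=200)

theta = np.zeros(2)
g = neg_log_lik_grad(X, y, theta)
F = fisher(X)
EF = empirical_fisher(X, y, theta)

# Natural gradient step with the Fisher (here identical to a Newton step).
theta_ngd = theta - np.linalg.solve(F, g)

# Same update with the empirical Fisher; the tiny damping term is an
# illustrative safeguard against an ill-conditioned matrix.
theta_ef = theta - np.linalg.solve(EF + 1e-8 * np.eye(2), g)

print("Fisher step:          ", theta_ngd)
print("Empirical Fisher step:", theta_ef)
```

In this toy setting a single Fisher-preconditioned step reaches the least-squares solution from any starting point, whereas the empirical Fisher step depends on the residuals at the current iterate.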

empirical fisher approximation, gradient descent, name change, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.71)

Limitations of the Empirical Fisher Approximation for Natural Gradient Descent

Neural Information Processing Systems | Oct-2-2025, 15:51:48 GMT

Several highly visible works have advocated an approximation known as the empirical Fisher, drawing connections between approximate second-order methods and heuristics like Adam.

artificial intelligence, fisher, machine learning, (13 more...)

Country:

Europe (1.00)
North America > United States > California (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.53)

Reviews: Limitations of the empirical Fisher approximation for natural gradient descent

Neural Information Processing Systems | Jan-23-2025, 09:11:07 GMT

Originality: the paper lacks a sound and novel contribution. Theoretically, there is only one minor result, as stated above. Technically, there is no systematic experimental study on real deep networks. The main contribution is a discussion of two different formulations of the Fisher matrix. What makes the two formulations differ (although the authors take a more sophisticated route through the GGN) is that the so-called empirical Fisher depends on y_n, the target for the neural network output. If one considers y_n to be randomly distributed with fixed variance around the neural network output, the two formulations are equivalent; otherwise there is a shrinking scale parameter in eq. (3), and the shrinking and damping make the two formulations different.
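For reference, the two formulations contrasted in this review are commonly written as below. This is the standard form, assuming a model $p_\theta(y \mid x)$ and $N$ training pairs $(x_n, y_n)$; the notation is chosen here for illustration and is not a reproduction of the paper's eq. (3).

```latex
\begin{align}
F(\theta) &= \sum_{n=1}^{N} \mathbb{E}_{y \sim p_\theta(y \mid x_n)}
  \left[ \nabla_\theta \log p_\theta(y \mid x_n)\,
         \nabla_\theta \log p_\theta(y \mid x_n)^\top \right], \\
\tilde{F}(\theta) &= \sum_{n=1}^{N}
  \nabla_\theta \log p_\theta(y_n \mid x_n)\,
  \nabla_\theta \log p_\theta(y_n \mid x_n)^\top .
\end{align}
```

Only the second expression involves the observed targets $y_n$; the first averages over labels drawn from the model's own predictive distribution.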

contribution, empirical fisher approximation, natural gradient descent, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.40)

Reviews: Limitations of the empirical Fisher approximation for natural gradient descent

Neural Information Processing Systems | Jan-23-2025, 09:10:56 GMT

All reviewers were positive about the paper. The paper corrects several common incorrect assertions and misleading derivations in the literature on natural gradient algorithms. The exposition is remarkably clear, with the potential to serve as a reference paper on the topic. The paper is clearly of broad interest to the machine learning community. We recommend taking the reviewers' comments and suggestions into account while preparing the camera-ready version of the paper.

empirical fisher approximation, limitation, natural gradient descent, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.40)

Limitations of the empirical Fisher approximation for natural gradient descent

Neural Information Processing Systems | Oct-9-2024, 23:06:25 GMT

Natural gradient descent, which preconditions a gradient descent update with the Fisher information matrix of the underlying statistical model, is a way to capture partial second-order information. Several highly visible works have advocated an approximation known as the empirical Fisher, drawing connections between approximate second-order methods and heuristics like Adam. We dispute this argument by showing that the empirical Fisher, unlike the Fisher, does not generally capture second-order information. We further argue that the conditions under which the empirical Fisher approaches the Fisher (and the Hessian) are unlikely to be met in practice, and that, even on simple optimization problems, the pathologies of the empirical Fisher can have undesirable effects.

empirical fisher approximation, gradient descent, natural gradient descent, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.95)

Limitations of the empirical Fisher approximation for natural gradient descent

Kunstner, Frederik, Hennig, Philipp, Balles, Lukas

Neural Information Processing Systems | Mar-18-2020, 22:03:18 GMT

Natural gradient descent, which preconditions a gradient descent update with the Fisher information matrix of the underlying statistical model, is a way to capture partial second-order information. Several highly visible works have advocated an approximation known as the empirical Fisher, drawing connections between approximate second-order methods and heuristics like Adam. We dispute this argument by showing that the empirical Fisher, unlike the Fisher, does not generally capture second-order information. We further argue that the conditions under which the empirical Fisher approaches the Fisher (and the Hessian) are unlikely to be met in practice, and that, even on simple optimization problems, the pathologies of the empirical Fisher can have undesirable effects. Papers published at the Neural Information Processing Systems Conference.

empirical fisher approximation, gradient descent, natural gradient descent, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.95)

Limitations of the Empirical Fisher Approximation

Kunstner, Frederik, Balles, Lukas, Hennig, Philipp

arXiv.org Machine Learning | May-29-2019

Natural gradient descent, which preconditions a gradient descent update with the Fisher information matrix of the underlying statistical model, is a way to capture partial second-order information. Several highly visible works have advocated an approximation known as the empirical Fisher, drawing connections between approximate second-order methods and heuristics like Adam. We dispute this argument by showing that the empirical Fisher, unlike the Fisher, does not generally capture second-order information. We further argue that the conditions under which the empirical Fisher approaches the Fisher (and the Hessian) are unlikely to be met in practice, and that, even on simple optimization problems, the pathologies of the empirical Fisher can have undesirable effects.

artificial intelligence, fisher, machine learning, (16 more...)

1905.12558

Country: